Cheminformatics-Guided Exploration of Synthetic Marine Natural Product-Inspired Brominated Indole-3-Glyoxylamides and Their Potentials for Drug Discovery

Marine natural products (MNPs) continue to be tested primarily in cellular toxicity assays, both mammalian and microbial, despite most being inactive at concentrations relevant to drug discovery. These MNPs become missed opportunities and represent a wasteful use of precious bioresources. The use of cheminformatics aligned with published bioactivity data can provide insights to direct the choice of bioassays for the evaluation of new MNPs. Cheminformatics analysis of MNPs found in MarinLit (n = 39,730) up to the end of 2023 highlighted indol-3-yl-glyoxylamides (IGAs, n = 24) as a group of MNPs with no reported bioactivities. However, a recent review of synthetic IGAs highlighted these scaffolds as privileged structures with several compounds under clinical evaluation. Herein, we report the synthesis of a library of 32 MNP-inspired brominated IGAs (25–56) using a simple one-pot, multistep method affording access to these diverse chemical scaffolds. Directed by a meta-analysis of the biological activities reported for marine indole alkaloids (MIAs) and synthetic IGAs, the brominated IGAs 25–56 were examined for their potential bioactivities against the Parkinson’s Disease amyloid protein alpha synuclein (α-syn), antiplasmodial activities against chloroquine-sensitive (3D7) and chloroquine-resistant (Dd2) parasite strains of Plasmodium falciparum, and inhibition of mammalian (chymotrypsin and elastase) and viral (SARS-CoV-2 3CLpro) proteases. All of the synthetic IGAs tested exhibited binding affinity to the amyloid protein α-syn, while some showed inhibitory activities against P. falciparum and the proteases SARS-CoV-2 3CLpro and chymotrypsin. The cellular safety of the IGAs was examined against cancerous and non-cancerous human cell lines, with all of the compounds tested inactive, thereby validating the cheminformatics and meta-analysis results.
The findings presented herein expand our knowledge of marine IGA bioactive chemical space and advocate expanding the scope of biological assays routinely used to investigate NP bioactivities, specifically those more suitable for non-toxic compounds. By integrating cheminformatics tools and functional assays into NP biological testing workflows, we can aim to enhance the potential of NPs and their scaffolds for future drug discovery and development.


Introduction
Due to their enzymatic biosynthesis and evolutionarily encoded interactions with the proteins and receptors of living systems, NPs are a valuable source of potential therapeutics that continue to be leveraged for drug discovery. Currently, MNPs have provided the source and/or inspiration for 15 approved therapeutics worldwide, mainly for the treatment of cancers, with dozens more positioned in various phases of clinical trials [1]. Despite this, the proportion of NP chemical space that is biologically relevant (that is, chemical space exclusive to bioactive NPs) remains vastly underexplored [2]. Recently, we published two meta-analyses that outline limiting factors contributing to this apparent lack of understanding of biologically relevant NP chemical space [3]. Our first meta-analysis examined the biological activities reported for new MNPs between 2016 and 2020 and revealed that 68% were neither cytotoxic, antimicrobial, nor anti-inflammatory and that most MNPs were not active in the screens they were tested in [3]. These findings were supported by a subsequent meta-analysis of MIA diversity and bioactivity, with the majority of MIAs examined for cytotoxicity and antimicrobial activities despite around 90% of compounds not being active at concentrations relevant to drug discovery in these assays [3]. Both meta-analyses clearly showed that MNPs continue to be examined against a narrow range of disease and protein targets, particularly when compared to synthetic drugs and drug leads, and that a lack of coherent strategies aimed at bridging the gap between MNP chemical space and biologically relevant chemical space continues to limit our understanding of their biological potential [3].
A common limitation for expanding NP biologically relevant chemical space (aside from the challenges associated with procuring academic funding, access to core facilities, and the availability of expertise required for investigating new biological targets) is access to sufficient quantities of compounds for random biological screening against diverse targets. Reasons that limit the supply of NPs include low isolation yield, access to producing organism/s, complexities and costs associated with re-isolation (inactivation of microbial biosynthetic gene clusters, for example), and unsustainable bioprospecting of NP-producing plants and animals. The structural complexity of some NPs makes them challenging to synthesise, but if they are amenable to synthesis, this helps to alleviate compound availability and promotes diversified biological evaluations [4]. However, without good reason, synthesising a complex natural product on the off chance that it may show potent activity in an untested assay is an extremely risky and expensive endeavour. Predictive computational approaches, such as cheminformatics-based ones, have become important tools for analysing the information available in what are increasingly large databases, thereby promoting more directed research efforts [3,5-7]. Several curated NP databases exist, such as MarinLit [8], SuperNatural 3.0 [9], and COCONUT [10], each of which houses extensive collections of chemical and biological data associated with NPs that can be effectively analysed using cheminformatics and other computational tools. Orthogonal approaches that combine cheminformatics-based analyses of large NP databases with the synthesis of NP scaffolds provide excellent platforms aimed at expanding biological testing opportunities and what is known about the biologically relevant chemical space of these important natural resources. In fact, we have recently shown that cheminformatics tools integrated with NP bioactivity information successfully directed the antibacterial evaluation of a series of marine-inspired bis-indole alkaloids [11].
Despite reduced funding from pharmaceutical and industry sectors, the reporting of MNPs continues to grow annually (the MarinLit database contained 40,543 MNPs as of August 2023) [8], driven primarily by a strong global academic network [3]. MIAs represent a diverse sub-class of MNPs ubiquitous to most marine phyla, with 2048 reported in the MarinLit database to the end of 2023 [8]. Moreover, the indole motif is considered a privileged structure in drug discovery due to its potential for biological activity, with several indole drugs approved for the treatment of disease and infection targets [3,12]. During our recent meta-analysis of MIA diversity and bioactivity [3], we identified an interesting group of marine indol-3-yl-glyoxylamides (IGAs) with vastly underexplored bioactivities (1-24; Figures 1 and S1; Table S1). In contrast, several synthetic IGAs have been reported for a range of biological activities, including neuroprotective, antimicrobial, antiprotozoal, antiviral, antiprion, and enzymatic inhibition [13]. Some of these IGAs have also found clinical use, supporting their classification as a privileged structure for drug discovery [13].
Guided by cheminformatics analyses of MNP IGA chemical diversity and their lack of reported bioactivities, we embarked upon the synthesis of marine-inspired IGAs in an effort to expand our understanding of their biologically relevant chemical space. Herein, we report a simple one-pot, multi-step synthesis of 5- or 6-brominated IGAs (25-56) incorporating a broad range of proteinogenic D and L amino acids. Guided by their structural similarity to other synthetic IGAs, the 32 synthetic indoles were evaluated against a panel of mostly functional biological disease and infection targets, including binding to the Parkinson's disease amyloid protein α-synuclein (α-syn), inhibition of mammalian serine proteases (elastase and chymotrypsin), inhibition of SARS-CoV-2 3CL protease, and antiplasmodial activities. In addition, their effect on the cell viability of human embryonic kidney cells and three cancerous cell lines (breast, ovarian, and colon) was also examined.

Cheminformatics Analyses of Marine Natural Product Indol-3-yl-Glyoxylamides
Self-organising maps (SOMs) provide excellent tools for visualising chemical diversity, with structurally related molecules positioned within shared regions or clusters of two-dimensional space based on molecular fragment similarity scoring using neural network algorithms. Using the freely available DataWarrior (version 6.01.05) software [14], a SOM of MIAs (n = 2048) reported to the end of 2021 was generated from the MarinLit database (Figures S2 and S3) [8]. The SOM's complex background topography (Figure S2) was populated with MIAs (coloured according to producing marine phyla) arranged in space according to structural similarity, which clearly emphasised the chemical diversity present within MIAs (Figure S3). Structural filtering of the MIA SOM for those containing the IGA motif identified 24 compounds (1-24, Figure 2A), almost all of which were reported from marine invertebrates (Chordata n = 11, Porifera n = 11, and Mollusca n = 1), with the exception of IGA 13G-291 (13), a staurosporine-substituted IGA isolated from a Streptomyces species (Figure S3) [15]. The chemical diversity of marine IGAs resided mainly within two clusters (red and blue circles, Figure 2A). The two clusters contained two types of IGAs: the first group (red circle, n = 14) represented IGA motifs substituted with amino acids (carboxylated or decarboxylated) and amines (red, Figure S1), while the second (blue circle, n = 5) contained IGA enamine NPs (blue, Figure S1). Singleton IGAs (n = 5) included herdmanine K (5), located with other histidine-containing MIAs; 13G-291 (13), with the large staurosporine class of compounds; preoxazinin-7 (16), with other mollusc-derived oxazinins; hyrtimomine F (17), positioned with other structurally related cyclic indoles; and pyrinodermin F (21), situated near other long-chain alkyne-substituted indole amides.
A meta-analysis of the biological activities reported for the 24 IGAs found that 20 had been tested for bioactivity, with an average of 1.4 disease or infection targets examined per compound, a testing rate lower than the overall testing rate for MIAs (n = 1.9) [3]. The biological activities reported for 1-24 were then categorised according to the potency classes relevant to drug discovery described previously (inactive, weak, moderate, and potent, Table S2) [3]. This showed that 19 of the 20 IGAs examined for bioactivity were inactive, with only one, staurosporine analogue 13, from a structure class broadly recognised for anticancer activity, categorised with weak activity (Figure 2C) [15]. An overview of the biological targets and the potencies of marine versus synthetic IGAs clearly outlined the narrow breadth of disease and infection targets explored and the lack of meaningful bioactivities for marine IGAs. Most assays applied to marine IGAs measured cellular toxicity. Despite the focus on cytotoxicity (n = 12) and antimicrobial activities (n = 8), all the marine IGAs tested were inactive in these assays (Figure 2D). Moreover, there was a clear lack of non-toxic disease and infection targets examined for marine IGA bioactivities compared to synthetic ones. Synthetic IGAs displayed good potency toward targets not thoroughly examined for marine IGAs, including CNS, protozoan, anti-inflammatory, and viral ones [13]. However, while a small number of marine IGAs were screened in assays other than cytotoxic and antibacterial ones, including antiplasmodial and antiviral assays, all were found to be inactive [16][17][18]. Other marine IGA targets included protein kinase C, where staurosporine IGA 13G-291 (13) displayed only weak inhibition [15], and PPAR-γ agonist antidiabetic activity, in which the herdmanines I and K (5 and 6) were reported with unquantifiable potencies [19]. Interestingly, synthetic IGAs have shown excellent activities against CNS targets, specifically antiprion activity [13]. This bioactivity and others, including antileishmanial and antiviral activities, led us to examine the chemical similarity of marine IGAs compared with bioactive synthetic IGAs (Figure 2B). The resultant SOM showed that marine IGAs shared good overlap with synthetic IGAs, particularly those with bioactivities underexamined for marine IGAs, namely CNS and antiprotozoal bioactivities.
The aforementioned meta- and cheminformatics analyses directed our attention toward the synthesis of leptoclinidamine A-C (1-3) inspired IGA scaffolds. The IGA scaffolds 1-3 overlapped favourably with bioactive synthetic IGAs and also contained structures amenable to synthesis. By synthesising a library of 5- or 6-bromoindole-containing IGAs coupled with a broad range of proteinogenic D and L amino acids, we could further explore their potential bioactive chemical space.
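As a rough illustration of how a self-organising map places structurally similar molecules in the same or neighbouring cells, the following minimal sketch trains a very small SOM on toy binary fingerprints. It is not DataWarrior's implementation and does not use the SkelSpheres descriptor; the fingerprints, grid size, learning-rate schedule, and the function names `train_som` and `best_matching_unit` are all illustrative assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the neuron whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols SOM with a Gaussian neighbourhood (toy settings)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.01     # neighbourhood shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Hypothetical fingerprints for two structural families (not real descriptors).
family_a = [[1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 1, 1, 1, 0, 0]]
family_b = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 1], [0, 0, 1, 1, 1, 1]]
som = train_som(family_a + family_b)
positions = [best_matching_unit(som, fp) for fp in family_a + family_b]
print(positions)
```

Each molecule is assigned to the grid cell of its best-matching neuron, and colouring those cells by source (phylum, marine versus synthetic) gives the kind of map described above. Real analyses use much larger grids and chemically meaningful descriptors.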

Synthesis of Indol-3-yl-Glyoxylamides 25-56
The general synthetic procedure undertaken for the preparation of the IGA analogues reported herein (25-56, Figures 3, S4 and S5, and Table 1) was based on an NMR tube reaction we reported previously for the synthesis of (−)-leptoclinidamine A [16]. Our scaled-up synthetic strategy was modified to incorporate a one-pot multi-step method beginning with diacylation of the appropriate brominated indole with oxalyl chloride under argon at room temperature for 45 min (Scheme 1). The resultant reaction mixture was then carefully heated to 50 °C to remove ether, after which DMF and anhydrous pyridine were added sequentially. The temperature of the stirring reaction mixture was increased to between 80 and 90 °C before the desired D or L amino acid was added after 15 min. The resultant reaction mixture was left heating overnight, then quenched with H2O (2.0 mL) and repeatedly partitioned between EtOAc and H2O, with the organic phase dried over anhydrous sodium sulphate and concentrated in vacuo. For all reactions, except those containing amino acids with basic side chains, the pH of the mixtures was monitored and acidified with 1 M HCl (pH 3-5) to promote migration of compounds into partitioned organic phases.
Molecules 2024, 29, x FOR PEER REVIEW 6 of 26
Following this, the crude reaction mixtures were adsorbed onto C18-bonded silica gel, loaded into a refillable guard column, and purified using preparative reversed-phase high-pressure liquid chromatography (RP HPLC) with a decreasing polarity solvent gradient from 100% H2O (0.1% TFA) to 100% MeOH (0.1% TFA) over 60 min with fractions collected each min. Analogues requiring further purification were resuspended in MeOH (100 µL), then injected onto a semi-preparative RP HPLC column eluting with an optimised solvent gradient from 40% MeOH/60% H2O (0.1% TFA) to 75% MeOH/25% H2O (0.1% TFA) over 40 min at a flow rate of 4.0 mL per min with fractions collected each min or by hand. It must be noted that other synthetic methodologies report the synthesis of indole-3-glyoxyl compounds using THF [20,21]. However, due to the poor solubility of polar amino acids in THF, the resultant amide coupling reactions were low yielding, with considerable amounts of unreacted amino acid starting material present after purification. It was found that DMF with the addition of heat (~90 °C) provided good solubilisation of all amino acids, particularly the more polar ones.
Scheme 1. One-pot, two-step synthesis of brominated indol-3-yl-glyoxylamide analogues (25-56).
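To make the two solvent gradients above easier to reason about, the short sketch below computes the nominal MeOH percentage at a given time. It assumes the gradients are linear, which the text does not state explicitly, and ignores system dwell volume; the function name `percent_meoh` is hypothetical.

```python
def percent_meoh(t_min, start_pct, end_pct, duration_min):
    """Nominal % MeOH at time t_min for a linear gradient (assumed linear)."""
    if not 0 <= t_min <= duration_min:
        raise ValueError("time must lie within the gradient window")
    return start_pct + (end_pct - start_pct) * t_min / duration_min


# Preparative run: 100% H2O to 100% MeOH over 60 min.
prep_midpoint = percent_meoh(30, 0.0, 100.0, 60)        # 50.0

# Semi-preparative run: 40% to 75% MeOH over 40 min.
semiprep_midpoint = percent_meoh(20, 40.0, 75.0, 40)    # 57.5

# At 4.0 mL/min with one fraction per minute, each fraction is 4.0 mL.
fraction_volume_ml = 4.0 * 1.0

print(prep_midpoint, semiprep_midpoint, fraction_volume_ml)
```

Under these assumptions, a compound collected in the fraction at 20 min of the semi-preparative run would have eluted at roughly 57.5% MeOH.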

Exploration of Biological Chemical Space of Brominated Indol-3-yl-Glyoxylamides (25-56)
The chemical diversity of the brominated synthetic IGAs 25-56 (n = 32) was incorporated into the dataset of MIAs (n = 2048) and synthetic IGAs (n = 147) and visualised using an updated SOM (n = 2080, 50 × 50 neuron, SkelSpheres chemical descriptor, Figure 4). Encouragingly, the brominated leptoclinidamine-inspired IGAs 25-56 occupied chemical space that both overlapped and expanded the chemical diversity already present within the marine and synthetic IGA dataset. In addition, the synthetic IGAs 25-56 also occupied chemical space corresponding with bioactive synthetic IGAs, including those with CNS (antiprion) and antiprotozoal activities [13]. These findings directed our attention toward diversifying the testing of 25-56 against a range of biological targets, including CNS, antiprotozoal, viral and mammalian proteases, and cancer cell cytotoxicity.
The misfolding and aggregation of the synaptic protein α-synuclein (α-syn) into amyloid fibrils, termed amyloidosis, is suspected to be causal in the debilitating neurodegenerative disorder Parkinson's disease (PD) [22,23]. The α-syn protein aggregation cascade, initiated by the misfolding and fibrilisation of the protein and aggregation of amyloid fibrils, has been hypothesised to lead to cellular damage and dysfunction, ultimately resulting in neuronal cell death [24]. Therefore, inhibition of the progression of the α-syn protein aggregation cascade, postulated to disrupt the development of neurotoxic forms of the amyloid protein, is an attractive small-molecule therapeutic target aimed at the treatment of PD and other neurological diseases caused by amyloidosis (Alzheimer's disease, Huntington's disease, and prion protein disorders) [25]. Our ongoing research in this area has reported several NPs with α-syn binding and/or anti-aggregation activities from both marine and terrestrial environments, highlighting the promise of NP scaffolds as potential therapeutics for amyloid protein diseases. We have previously reported two bioisosteres of IGAs, aplysamine-2 (57) and purealidine Q, both with micromolar anti-prion activities in a yeast-based assay (Figure 5) [25]. In addition, we have reported several brominated MNPs with α-syn binding and/or protein aggregation inhibitory activities, including aerothionin (58), aerophobin-2 (59), and butenolide-containing prunolides (prunolide B, 60) and procerolides (procerolide C, 61, Figure 5) [26,27]. Several of these MNPs showed dual yeast prion curing and α-syn aggregation inhibition. Given that synthetic IGAs have also been reported to potently inhibit human prion protein aggregation, this prompted us to test the synthetic brominated IGA analogues for binding affinity to α-syn using an affinity mass spectrometry assay.
To broadly cover the diversity of amino acid side-chain physicochemical properties housed within our library of synthetic IGAs (and to minimise expenses associated with screening 32 compounds), 5- and 6-brominated IGAs with arginine (25 and 26), cystine (34), isoleucine (41), tryptophan (45 and 46), and tyrosine (51 and 52) were selected for screening in an in vitro affinity mass spectrometry (MS) assay. The synthetic indoles were incubated with α-syn for 3 h at a 5:1 molar ratio, after which a (+)-mode high-resolution MS spectrum was acquired. The MS binding results displayed additional peaks indicating that all the analogues tested formed protein-ligand complexes indicative of their binding with α-syn (Figures 6 and S134-S140). The α-syn binding displayed by all of the synthetic IGAs tested, 25, 26, 34, 41, 45, 46, 51, and 52, provides compelling data that supports the antiprion CNS activities reported for related IGA analogues [13].
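The "additional peaks" read-out of an affinity MS experiment can be illustrated with a simple calculation of where a non-covalent protein-ligand complex would be expected in a positive-mode spectrum. The sketch below is an assumption-laden illustration, not the authors' data processing: the protein mass (about 14,460 Da, an approximate average mass for human α-syn), the 400 Da ligand mass, the charge state, and the function names are all placeholder values chosen for the example.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion in positive mode."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def complex_mz(protein_mass, ligand_mass, charge, n_ligands=1):
    """m/z of a non-covalent protein-ligand complex at the given charge state."""
    return mz(protein_mass + n_ligands * ligand_mass, charge)


# Placeholder values for illustration only (not taken from the study).
protein_mass = 14460.0   # Da, approximate
ligand_mass = 400.0      # Da, hypothetical ligand
charge = 10

free_peak = mz(protein_mass, charge)
bound_peak = complex_mz(protein_mass, ligand_mass, charge)
print(round(free_peak, 3), round(bound_peak, 3), round(bound_peak - free_peak, 3))
```

For a 1:1 complex, the bound peak sits at ligand mass divided by charge above the free protein peak in each charge state, which is how a new peak can be attributed to ligand binding.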

Antiplasmodial Activity-Plasmodium falciparum
Malaria, a vector-borne parasitic disease transmitted to humans by mosquitoes, continues to burden the health of lower socioeconomic countries, with an estimated 241 million cases reported in 2020 [28,29]. Traditionally, NP scaffolds have provided breakthrough treatments for malaria. However, ongoing drug resistance continues to emerge in the Plasmodium falciparum parasite, posing a significant threat to the effectiveness of current frontline drug treatments and highlighting the importance of discovering new ones. Despite this, the testing of MIAs for antiplasmodial activities has decreased by more than 50% since 2010 [3], a worrying trend considering the recent emergence of other zoonotic diseases (e.g., SARS-CoV-2) and the importance of NP scaffolds for the development of drug leads to treat these diseases.
Although MNP IGAs previously examined for antiplasmodial activities were found to be inactive [16,17], the bioactivities reported for synthetic IGAs against another protozoan target, Leishmania donovani, provided additional rationale for expanded screening [13]. We decided to explore the enhanced chemical diversity provided by our synthetic library, incorporating D and L amino acids and brominated indole, in an attempt to expand our current knowledge of IGA antiplasmodial biological space.

The synthetic IGAs herein were screened for inhibition of chloroquine-sensitive (3D7) and chloroquine-resistant (Dd2) strains of the malarial parasite P. falciparum, and for cytotoxic effects against human embryonic kidney (HEK293) cells (Table 2). The aromatic D-amino acid IGA analogues, D-tryptophan (46 and 48) and D-tyrosine (50), possessed the highest inhibitory activity against both chloroquine-sensitive and resistant strains, approaching 100% inhibition at 60 µM. Further, 5-bromoindole-3-glyoxyl-D-tryptophan (46) was the most potent compound against both 3D7 and Dd2 (IC50 7.4 and 8.2 µM, respectively) parasite strains and also showed excellent cellular safety towards HEK293 cells up to 60 µM. In fact, all of the IGAs tested were inactive against HEK293 cells. Some clear trends were observed for the antiplasmodial activities of brominated IGAs, with larger aromatic amino acid side chains (tryptophan and tyrosine) possessing more potent growth inhibition compared to aliphatic ones, while D amino acids displayed higher potencies compared to L-configured residues (D-arginine > L-arginine, D-tryptophan > L-tryptophan, and D-tyrosine > L-tyrosine). The activities obtained for the brominated arginine-containing IGAs are consistent with those previously reported for the non-brominated arginine enantiomers of leptoclinidamine A (1), also tested against both 3D7 and Dd2 strains of P. falciparum [16]. Moreover, bromination at C-5 or C-6 of indole does not appear to affect the antiplasmodial activities observed for the synthetic indoles examined herein.
Table 2 footnotes: a 3D7 = chloroquine-sensitive P. falciparum strain; b Dd2 = chloroquine-resistant P. falciparum strain; c human embryonic kidney cell line.
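Two simple ratios help put the values reported for compound 46 into context: a resistance index (IC50 in the resistant strain divided by IC50 in the sensitive strain) and a lower bound on the selectivity index (highest non-toxic concentration tested divided by the antiplasmodial IC50). The sketch below uses only the figures quoted above; the function names are hypothetical, and treating the 60 µM HEK293 result as a lower bound on cytotoxicity is an interpretive assumption rather than a reported value.

```python
def resistance_index(ic50_resistant_um, ic50_sensitive_um):
    """Ratio of IC50 values (resistant / sensitive); ~1 means no cross-resistance."""
    return ic50_resistant_um / ic50_sensitive_um


def min_selectivity_index(highest_nontoxic_conc_um, ic50_um):
    """Lower bound on selectivity when no cytotoxicity is seen up to a given concentration."""
    return highest_nontoxic_conc_um / ic50_um


# Values quoted in the text for compound 46.
ic50_3d7 = 7.4    # µM, chloroquine-sensitive strain
ic50_dd2 = 8.2    # µM, chloroquine-resistant strain
hek293_max = 60.0  # µM, highest concentration with no HEK293 toxicity

ri = resistance_index(ic50_dd2, ic50_3d7)
si_floor = min_selectivity_index(hek293_max, ic50_3d7)
print(round(ri, 2), round(si_floor, 1))  # 1.11 8.1
```

A resistance index close to 1 is consistent with the compound being similarly active against both strains, and the selectivity index is at least about 8 under the stated assumption.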

SARS-CoV-2 3CL Protease Activity
Despite the World Health Organisation declaring an end to the public health emergency caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in May 2023, the virus responsible for the coronavirus disease 2019 (COVID-19) pandemic continues to burden global health and economic outcomes. With the continuing emergence of drug- and vaccine-resistant variants, small-molecule antivirals remain critically important for treating acute infection. The SARS-CoV-2 3-chymotrypsin-like protease (3CLpro), also known as the main protease (Mpro), is an attractive small-molecule antiviral drug target for SARS-CoV-2 infection [30]. SARS-CoV-2 3CLpro is a highly conserved viral protease: like other coronaviruses, SARS-CoV-2 translates its open reading frames into polyproteins that are hydrolysed into mature proteins [30][31][32]. Functionalised indoles have been reported with good antiviral properties [13]. Examples include synthetic C-2 substituted indole amides 62 and 63, which displayed promising nanomolar inhibition of SARS-CoV-2 3CLpro [33], while IGAs 64 and 65 were reported with nanomolar inhibition of HIV-1 along with low cellular toxicity (Figure 7) [34,35].
To measure the potential antiviral activities of 25-56 against SARS-CoV-2 3CLpro, the IGAs were assayed by measuring the viral protease-catalysed cleavage of two fluorescent proteins in the presence and absence of dithiothreitol (DTT) [31]. The 5-bromo-L-serine IGA (43) was the most potent inhibitor of SARS-CoV-2 3CLpro (IC50 1.2 µM, Table S3), followed by brominated cystine dimers 33 and 34 (IC50 12.5 and 6.4 µM, respectively). Interestingly, 5-brominated-L-serine IGA 43 displayed inhibition 14 times more potent than its 6-brominated constitutional isomer 44, suggesting that bromination of indole at C-5 is favourable for SARS-CoV-2 3CLpro activity in serine-substituted IGAs.
In the presence of the reducing agent DTT (4 mM), all of the synthetic indoles displayed significantly weaker inhibition of the viral protease. A loss of inhibition in the presence of DTT is proposed to occur in compounds with thiol reactivities, indicating nonspecific covalent binding with cysteine proteases [32]. We recently examined the effect of DTT on the structure of cyclotheonellazole D (66, Figure 8), a cyclic marine peptide incorporating an α-keto-β-amide (3-amino-4-methyl-2-oxohexanoic acid) similar to that found in IGAs [36]. 1H NMR analyses of samples with and without DTT (4 mM) showed that the reducing agent had no effect on the structure of cyclotheonellazole D; however, its presence led to a more than eightfold reduction in the inhibition of SARS-CoV-2 3CLpro despite the marine peptide containing no obvious sites for thiol reactivity. Therefore, we urge caution when interpreting the SARS-CoV-2 3CLpro results in the presence of reducing agents, particularly for compounds that might bind proteins allosterically and for those with α-keto-β-amide moieties that, for example, bind via covalent tetrahedral hemiketal complexes with serine proteases [37,38].
In the presence of the reducing agent DTT (4 mM), all of the synthetic indoles displayed significantly weaker inhibition of the viral protease.A loss of inhibition in the presence of DTT is proposed to occur in compounds with thiol reactivities, indicating nonspecific covalent binding with cysteine proteases [32].We recently examined the effect of DTT on the structure of cyclotheonellazole D (66, Figure 8), a cyclic marine peptide incorporating an α-keto-β-amide (3-amino-4-methyl-2-oxohexanoic acid) similar to that found in IGAs [36]. 1 H NMR analyses of samples with and without DTT (4 mM) showed that the reducing agent had no effect on the structure of cyclotheonellazole D; however, its presence led to more than eightfold reduction in the inhibition of SARS-CoV-2 3CL Pro despite the marine peptide containing no obvious sites for thiol reactivity.Therefore, we urge caution when interpreting the SARS-CoV-2 3CL Pro results in the presence of reducing agents, particularly for compounds that might bind proteins allosterically and for those with α-keto-β-amide moieties, that for example, bind via covalent tetrahedral hemiketal complexes with serine proteases [37,38].To measure the potential antiviral activities of 25-56 against SARS-CoV-2 3CL Pro , the IGAs were assayed by measuring the viral protease catalysed cleavage of two fluorescent proteins in the presence and absence of dithiothreitol (DTT) [31].The 5-bromo-L-serine IGA (43) was the most potent inhibitor of SARS-CoV-2 3CL Pro (IC50 1.2 µM, Table S3), followed by brominated cystine dimers 33 and 34 (IC50 12.5 and 6.4 µM, respectively).Interestingly, 5-brominated-L-serine IGA 43 displayed inhibition 14 times more potent than its 6-brominated constitutional isomer 44, suggesting that bromination of indole at C-5 is favourable for the SARS-CoV-2 3CL Pro activity for serine-substituted IGAs.
In the presence of the reducing agent DTT (4 mM), all of the synthetic indoles displayed significantly weaker inhibition of the viral protease. A loss of inhibition in the presence of DTT is proposed to occur in compounds with thiol reactivities, indicating nonspecific covalent binding with cysteine proteases [32]. We recently examined the effect of DTT on the structure of cyclotheonellazole D (66, Figure 8), a cyclic marine peptide incorporating an α-keto-β-amide (3-amino-4-methyl-2-oxohexanoic acid) similar to that found in IGAs [36]. 1H NMR analyses of samples with and without DTT (4 mM) showed that the reducing agent had no effect on the structure of cyclotheonellazole D; however, its presence led to a more than eightfold reduction in the inhibition of SARS-CoV-2 3CLpro despite the marine peptide containing no obvious sites for thiol reactivity. Therefore, we urge caution when interpreting SARS-CoV-2 3CLpro results obtained in the presence of reducing agents, particularly for compounds that might bind proteins allosterically and for those with α-keto-β-amide moieties, which, for example, bind via covalent tetrahedral hemiketal complexes with serine proteases [37,38].

Mammalian Serine Protease (Chymotrypsin and Elastase) Activity
Serine proteases are a superfamily of enzymes responsible for catalysing the hydrolytic cleavage of amide bonds in peptides and proteins via a nucleophilic active-site serine hydroxyl group [39]. Elastases, proteolytic enzymes that degrade connective tissue, are involved in the pathogenesis of various inflammatory diseases and have been implicated in acute respiratory distress syndrome (ARDS) and acute lung injury (ALI) caused by viral infections such as SARS-CoV-2 [40,41].

Cytotoxicity against Human Breast, Ovarian, and Colon Cancer Cell Lines
The brominated IGAs (25-42 and 44-56) were tested for inhibition of cell growth of human breast (MDA-MB-231), ovarian (OVCAR8), and colon (HCT-116) cancer cell lines. Consistent with the lack of cytotoxicity reported for other MNP IGAs, none of the synthetic compounds were cytotoxic at concentrations <20 µM. These results, along with the demonstrated cellular safety of IGAs against non-cancerous HEK293 cells (up to 60 µM), support their assessment in functional assays where cellular toxicity is unwanted.

Discussion
The marine-inspired brominated IGAs 25-56 synthesised herein presented promising bioactivities against several disease targets unrelated to cytotoxicity. In addition, because IGAs are amenable to synthetic modification, their potencies against specific disease and infection targets could likely be improved.
The progression of neurodegenerative diseases, such as Parkinson's, Alzheimer's, Huntington's, and Creutzfeldt-Jakob disease, alongside metabolic disorders and cancers, has been linked to the accumulation of misfolded proteins and their aggregation into insoluble amyloid fibrils [42]. The brominated IGAs tested, namely the arginine (25 and 26), cystine (34), isoleucine (41), tryptophan (45 and 46), and tyrosine (51 and 52) analogues, all displayed promising binding activities with α-syn, consistent with the clinically relevant antiprion CNS activities reported for other synthetic IGAs [43]. We suggest further exploration of marine IGA scaffolds targeting amyloid protein diseases, particularly those IGAs substituted with aromatic amino acids, as excellent starting points, given the nanomolar antiprion activity reported for structurally related IGAs [13,43].
Based on the antileishmanial results reported for IGAs against the human blood parasite L. donovani, we would anticipate an improvement in the potency of antiplasmodial activity for brominated IGA amino acid analogues if they were converted to methyl esters and/or the tryptophan amino acid analogues cyclised with benzaldehydes to form aryl tetrahydro-β-carbolines (Figure 9) [13,44]. Moreover, the SAR for antiplasmodial bioactivity obtained here clearly favoured D-aromatic amino acid IGA analogues 46, 48, and 50, which are scaffolds worthy of further modification and investigation against P. falciparum and other human pathogenic protozoan targets. Although only a handful of the IGAs synthesised herein exhibited meaningful inhibition of SARS-CoV-2 3CLpro, namely the cysteine dimers 33 and 34 and 5-bromo-serine IGA 43, further investigation of these scaffolds after synthetic modification is warranted. Because the IGAs 25-56 feature 5- or 6-brominated indoles, there is an excellent opportunity to increase their structural diversity using Suzuki coupling synthetic strategies (Figure 9). Furthermore, the C-terminus of the synthetic IGA scaffold is also available for coupling with additional amino acids or amines. These synthetic modifications are expected to expand both the size and structural diversity of marine IGA scaffolds while allowing for the exploration of favourable substrate interactions with the active site of the viral protease [30].
Similar synthetic modifications to those described for SARS-CoV-2 3CLpro could also form the basis for improving the specificity and potency of IGAs toward mammalian serine proteases. Based on our results, 5-brominated D-tryptophan IGA 46 is an ideal target (alongside other IGAs) for further exploration against chymotrypsin after synthetic modification that extends the IGA substrate at one or both ends utilising Suzuki and/or amide coupling reactions (Figure 9). Interestingly, none of the synthetic IGAs 25-46 displayed significant inhibition of elastase despite containing the α-keto-β-amide functionality observed in other nanomolar potent marine NP inhibitors [36–38,41]. Based on the formation of a tetrahedral hemiketal between the α-keto-β-amide and Ser195 of trypsin [37], we propose that the extended conjugation of indole α-keto-β-amide results in inefficient formation of this transition state, thereby compromising binding efficiency. Synthetic efforts that introduce the α-keto-β-amide distal to the indole C-3 to disrupt resonance with the α-keto-β-amide group could be expected to improve the binding efficiency of IGAs with serine proteases (Figure 9). Finally, although the non-toxic activities of the synthetic IGAs 25-56 were a primary focus of this work, it appears that N-substituted IGAs coupled with planar aromatic amines are a consistent structural feature required for antitumoral bioactivities [13].

Chemistry
All anhydrous reaction and bulk solvents and reagents were purchased from Merck (Bayswater, Victoria, Australia) except for 5- and 6-bromoindole (Enamine), and bulk solvents were distilled prior to use. All reactions were carried out under inert N2 or argon atmospheres under standard reaction conditions. Reactions were heated using a magnetic stirrer hotplate and appropriate heating block. UV spectra were recorded on a Shimadzu UV-1800 spectrophotometer (Kyoto, Japan), optical rotations were acquired on a JASCO P-1020 polarimeter (Kingsgrove, NSW, Australia) with [α]D values given in 10⁻¹ deg cm² g⁻¹, and IR data were obtained in MeOH using a Thermo Fisher Scientific Nicolet iS5 spectrometer (Parkville, Victoria, Australia) equipped with an iD5 ATR accessory (Thermo Fisher). NMR spectra were recorded on an 800 MHz Bruker® Avance III HDX equipped with a Triple Resonance 5 mm Cryoprobe with Z-gradient and automatic tuning with a SampleJet automatic sample changer, a Bruker Avance III 500 MHz spectrometer (BBFO Smartprobe, 5 mm 31P-109Ag), or a Bruker® 400 MHz Avance III spectrometer equipped with a 5 mm BBFO probe with Z-gradient and automatic tuning with a SampleCase automatic sample changer. All NMR spectrometers were controlled using IconNMR™ automation software version 5.1 (all Bruker equipment from Bruker Pty Ltd., Preston, Victoria, Australia), with all spectra acquired at 25 °C using standard Bruker pulse sequences. The 1H and 13C chemical shifts were referenced to the residual DMSO-d6 (Cambridge Isotope Laboratories, Inc., Heidelberg West, Victoria, Australia) solvent peaks, δH 2.50 and δC 39.52, respectively, and processed using the Mnova™ software suite (version 12.0). MS spectra and MS/MS experiments were obtained on an Agilent Technologies 5530 accurate mass QTOF LC/MS controlled by Mass Hunter data acquisition and analytical software (Agilent Technologies, version B.04.00, Mulgrave, Victoria, Australia) and equipped with an Agilent Technologies 1290 Infinity series binary pump and autosampler (Mulgrave, Victoria, Australia). HPLC fractions were dried using a Genevac HT-12 Series II evaporation system (LabGear Australia, South Melbourne, Victoria, Australia). All solvents used for HPLC were Lab-Scan HPLC grade, and H2O was Millipore Milli-Q PF (Bayswater, Victoria, Australia) filtered, while solvents used for MS were LCMS grade.

Synthetic Procedures

Synthesis of Indol-3-yl-Glyoxylamides 25-56
Under an inert atmosphere, oxalyl chloride (30 µL, 0.35 mmol) was added to a solution of 5- or 6-bromoindole (25 mg, 0.13 mmol) in diethyl ether (1.5 mL) and the reaction mixture was stirred at room temperature for 45 min. The reaction mixture was heated to 50 °C to remove diethyl ether over 15 min, then resuspended in DMF (1.5 mL) and anhydrous pyridine (50 µL, 0.50 mmol) and heated to 90 °C. After 15 min, the appropriate D or L amino acid (0.2 mmol) was added, and the crude reaction mixture was stirred overnight at 90 °C. Following this, the reaction mixture was cooled to room temperature and partitioned repeatedly between EtOAc and water, with the organic phase dried over sodium sulfate, filtered, and concentrated to dryness.
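The reagent stoichiometry in this procedure can be checked with simple arithmetic. The following minimal Python sketch converts the stated mass of bromoindole to mmol and expresses the other reagents as equivalents relative to it. The molecular weight of bromoindole (196.04 g/mol for C8H6BrN) and the function names are assumptions introduced here for illustration; they are not given in the procedure.

```python
# Illustrative stoichiometry check; not part of the reported procedure.
BROMOINDOLE_MW = 196.04  # g/mol for C8H6BrN (assumed value, not stated in the text)


def mmol_from_mass(mass_mg: float, mw_g_per_mol: float) -> float:
    """Convert a mass in mg to an amount in mmol (mg / (g/mol) = mmol)."""
    return mass_mg / mw_g_per_mol


def equivalents(reagent_mmol: float, limiting_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


if __name__ == "__main__":
    indole_mmol = mmol_from_mass(25.0, BROMOINDOLE_MW)
    print(f"bromoindole: {indole_mmol:.2f} mmol")
    # Amounts (mmol) as stated in the procedure above.
    for name, mmol in [("oxalyl chloride", 0.35),
                       ("pyridine", 0.50),
                       ("amino acid", 0.20)]:
        print(f"{name}: {equivalents(mmol, indole_mmol):.1f} equiv")
```

Under these assumptions, the bromoindole is the limiting reagent and all other reagents are used in excess.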

Cheminformatics Analyses of Marine and Synthetic Indole Chemical Diversity
The structures for 2048 MIAs reported in the MarinLit database (up to the end of 2021) were imported into the freely available cheminformatics software Osiris DataWarrior (version 6.01.05) [8,14]. The reported biological activities were manually obtained from biological testing data reported in the primary literature for the 2048 compounds and scaled according to the disease and infection target and potency of activity classifications in Table S2 [3]. The chemical diversity of MIAs, bioactive synthetic IGAs (n = 147), and the synthetic IGAs 25-56 (n = 32) was visualised using a 50 × 50 neuron self-organising map (SOM) and the SkeletonSpheres chemical descriptor (1024 bin resolution non-binary byte array/vector).

SARS-CoV-2 3CL pro Inhibition Assays
The assays for measuring inhibition of SARS-CoV-2 3CLpro have been previously reported [31]. Reactions contained 0.01% Triton X-100 to minimise compound and enzyme aggregation [46] and were performed with and without 4 mM DTT [32]. All protease assays were performed in 96-well plates in 50 mM Tris and 1 mM EDTA (pH 7.5) in a total volume of 200 µL. Each well contained 25 nM of SARS-CoV-2 3CLpro and eCFP-Venus biosensor (500 nM). After the sequential addition of SARS-CoV-2 3CLpro, buffer (100 µL), and compounds 25-56 to each well, the assay was initiated by adding eCFP-Venus biosensor (100 µL). The plates were incubated at 30 °C for 4 h, with fluorescence measured using an excitation wavelength of 434 nm and emission wavelengths of 477 and 528 nm. High-throughput screening (HTS) was performed in duplicate using 5 µM of each compound. The Z' score was calculated using the sample means and standard deviations for positive and negative controls. The dose-response for selected hits was performed using concentrations ranging from 0.001 µM to 20 µM and included a sample with no added compound. All data processing was performed in R (version 4.0.2).
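For readers unfamiliar with the Z' score, the widely used definition is Z' = 1 − 3(σp + σn)/|μp − μn|, where μ and σ are the means and standard deviations of the positive (p) and negative (n) controls. The sketch below illustrates this calculation in Python. It is not the R code used for the data processing reported here, the control values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def z_prime(positive: list[float], negative: list[float]) -> float:
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|.

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption made for the illustration.
    """
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation


if __name__ == "__main__":
    # Hypothetical control readouts in arbitrary units (not data from this study).
    positive_controls = [0.98, 1.02, 1.00, 1.01, 0.99]
    negative_controls = [0.20, 0.22, 0.18, 0.21, 0.19]
    print(f"Z' = {z_prime(positive_controls, negative_controls):.2f}")
```

Values of Z' approaching 1 indicate well-separated controls with low variability.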

Serine Protease Assays
Serine protease-inhibitory activities for 25-56 were determined in 96-well plates. Each inhibitor was initially tested in duplicate against elastase and chymotrypsin at 10 µM. Where inhibitory activity was observed at this concentration, further experiments with multiple dilutions of the inhibitor were undertaken to determine the IC50. Assays for the inhibition of elastase were undertaken using high-purity porcine pancreatic elastase (Sigma, Melbourne, Victoria, Australia, E0258) in an assay buffer of 50 mM Tris-HCl (pH 8.0) at room temperature. Dilutions of each inhibitor (originally resuspended in DMSO) were prepared in triplicate and then preincubated with 375 ng of elastase for 15 min at room temperature in a total reaction volume of 85 µL. Following this preincubation, 15 µL of 2 mM N-succinyl-Ala-Ala-Ala-p-nitroanilide (Sigma, S4760) in assay buffer was added to each well and kinetic measurement of absorbance at 405 nm (Tecan Spark Plate Reader, Zürich, Switzerland) was undertaken for 30 min. Assays for the inhibition of chymotrypsin were undertaken at 37 °C using α-chymotrypsin from bovine pancreas (Sigma, C4129) in an assay buffer of 50 mM Tris-HCl, 100 mM NaCl, and 1 mM CaCl2 (pH 7.8). Dilutions of each inhibitor were prepared in triplicate and preincubated with 5 µg of chymotrypsin for 10 min at 37 °C in a total reaction volume of 75 µL. After preincubation, 25 µL of 2 mM N-succinyl-Gly-Gly-Phe-p-nitroanilide (Sigma, S1899) was added to each well and kinetic measurement of absorbance at 405 nm was undertaken for 30 min. Inhibitory activity was determined by comparison of the slope of the reaction curve for multiple dilutions of the inhibitor to that of an equivalent solvent control and expressed as IC50 ± standard deviation.
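The slope comparison described above can be sketched as follows. This minimal Python example assumes that the reaction rate is estimated as the ordinary least-squares slope of absorbance (405 nm) against time and that inhibition is expressed relative to the solvent control. The function names and readings are hypothetical, and the curve fitting used to derive IC50 values is not shown.

```python
def slope(times: list[float], absorbances: list[float]) -> float:
    """Ordinary least-squares slope of absorbance versus time."""
    n = len(times)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired readings")
    t_mean = sum(times) / n
    a_mean = sum(absorbances) / n
    sxy = sum((t - t_mean) * (a - a_mean) for t, a in zip(times, absorbances))
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


def percent_inhibition(inhibitor_slope: float, control_slope: float) -> float:
    """Inhibition (%) of the reaction rate relative to the solvent control."""
    if control_slope == 0:
        raise ValueError("solvent control shows no activity")
    return 100.0 * (1.0 - inhibitor_slope / control_slope)


if __name__ == "__main__":
    # Hypothetical kinetic readings (time in min, absorbance at 405 nm).
    minutes = [0.0, 10.0, 20.0, 30.0]
    solvent_control = [0.05, 0.25, 0.45, 0.65]
    with_inhibitor = [0.05, 0.15, 0.25, 0.35]
    rate_control = slope(minutes, solvent_control)
    rate_inhibited = slope(minutes, with_inhibitor)
    print(f"inhibition: {percent_inhibition(rate_inhibited, rate_control):.0f}%")
```

Repeating this calculation across a dilution series gives the inhibition values from which an IC50 can be estimated.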

Human Cancer Cell Cytotoxicity Assay
Synthetic 25-42 and 44-56 (prepared as 20 mM stock solutions in DMSO and tested as a 14-point concentration-response curve, CRC) and controls (0.4% DMSO and 10 µM puromycin) were added 24 h after cell seeding and incubated for 72 h. After 66 h, resazurin was added to a final concentration of 60 µM. Fluorescence was monitored (excitation and emission wavelengths 530 and 590 nm, respectively) using a Tecan Spark Plate Reader (Zürich, Switzerland) at 72 h. Data were normalised to DMSO and puromycin controls (10 µM). HCT-116 cells were cultured in McCoy's 5A media supplemented with 10% heat-inactivated FBS, and 500 cells/well were seeded into Greiner black-wall, clear-bottom 384-well cell culture plates. MDA-MB-231 cells were cultured in DMEM media supplemented with 10% heat-inactivated FBS and 10 mM HEPES, and 2000 cells/well were seeded into Greiner black-wall, clear-bottom 384-well cell culture plates. OVCAR8 cells were cultured in RPMI media supplemented with 10% heat-inactivated FBS, and 400 cells/well were seeded into Greiner black-wall, clear-bottom 384-well cell culture plates.
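Normalisation to the two controls can be illustrated with a short Python sketch. It assumes a linear scaling in which the mean DMSO signal corresponds to 100% viability and the mean puromycin signal to 0%; the function name and the fluorescence values are hypothetical and are not data from this study.

```python
from statistics import mean


def percent_viability(signal: float, dmso_mean: float, puromycin_mean: float) -> float:
    """Scale a resazurin fluorescence reading between the two controls.

    100% corresponds to the vehicle (DMSO) control and 0% to the
    puromycin control; the linear scaling is an assumption.
    """
    window = dmso_mean - puromycin_mean
    if window == 0:
        raise ValueError("controls are indistinguishable; cannot normalise")
    return 100.0 * (signal - puromycin_mean) / window


if __name__ == "__main__":
    # Hypothetical fluorescence readings in arbitrary units.
    dmso_wells = [49000.0, 50000.0, 51000.0]
    puromycin_wells = [1900.0, 2000.0, 2100.0]
    compound_well = 26000.0
    viability = percent_viability(compound_well, mean(dmso_wells), mean(puromycin_wells))
    print(f"viability: {viability:.0f}%")
```

A compound is then considered non-cytotoxic at a given concentration when its normalised viability remains close to that of the DMSO control.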

Conclusions
This work highlights the value of orthogonal approaches, including meta-analyses of NP bioactivities, cheminformatics tools, and NP synthesis, in expanding our knowledge of NP bioactive chemical space. Incorporating bioactivity data obtained from synthetic libraries of related structures has also been insightful in directing the choice of screening targets for marine NP-inspired IGAs. The new bioactivities obtained herein for 25-56, namely amyloid protein binding, antiplasmodial, and protease inhibitory activities, support more directed NP investigations that aim to broaden the scope of NP biological testing. Additionally, we have outlined synthetic approaches that explore further structural modification of marine IGA scaffolds to improve their potential applications toward several disease and infection targets. By exploring large NP databases with cheminformatics tools alongside the inclusion of functional assays, we can enhance the impact of NPs and NP scaffolds for future drug discovery and development.

Figure 9. Potential sites for synthetic modification of brominated IGA scaffolds (red circles) to improve potency and disease target specificity. Antiprotozoal activity: R1 = esterification; mammalian and viral proteases: R2 = Suzuki coupling to bromine (at C-5 or C-6 of indole) and/or R3 = amide coupling (amines or amino acids); antitumoral activity: R4 = N-alkylation.